Orthographic similarity search for dictionary lookup of Japanese words

Authors

  • Lars Yencken
  • Timothy Baldwin
Abstract

Finding an unknown Japanese word in a dictionary is a difficult and slow task when one or more of the word’s characters is unknown. For advanced learners, unknown characters evoke the form and meaning of visually similar characters they are familiar with. We propose a range of character distance metrics to allow learners to leverage known characters to search for words containing unknown but visually similar characters. This new form of dictionary search is implemented as an extension to the FOKS dictionary system.
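To make the idea concrete, here is a minimal Python sketch of a similarity search of this kind. It is an illustration under stated assumptions, not the metrics proposed in the paper or the FOKS implementation: the distance used is a plain Jaccard distance over character components, and the component table, word list, threshold, and function names (`COMPONENTS`, `WORDS`, `component_distance`, `neighbours`, `similarity_search`) are all hypothetical toy values chosen for the example.

```python
# Toy component sets, invented for illustration only (not real decomposition data).
COMPONENTS = {
    "補": {"衤", "甫"},
    "捕": {"扌", "甫"},
    "浦": {"氵", "甫"},
    "助": {"且", "力"},
}

# Toy dictionary, also hypothetical.
WORDS = {"補助": "assistance", "逮捕": "arrest"}


def component_distance(a, b):
    """Jaccard distance between component sets: 0.0 = identical, 1.0 = disjoint."""
    ca = COMPONENTS.get(a, {a})
    cb = COMPONENTS.get(b, {b})
    return 1.0 - len(ca & cb) / len(ca | cb)


def neighbours(char, max_distance=0.7):
    """Known characters within max_distance of char, nearest first."""
    scored = [(component_distance(char, c), c) for c in COMPONENTS if c != char]
    return [c for d, c in sorted(scored) if d <= max_distance]


def similarity_search(query):
    """Swap each query character for a visually similar one and keep real words."""
    results = []
    for i, ch in enumerate(query):
        for alt in neighbours(ch):
            candidate = query[:i] + alt + query[i + 1:]
            if candidate in WORDS and candidate not in results:
                results.append(candidate)
    return results


if __name__ == "__main__":
    # A learner who knows 捕 but not 補 types the familiar character instead.
    print(similarity_search("捕助"))
```

In this sketch the learner enters a known character in place of the unknown one, and the search proposes dictionary words in which a visually similar character occupies the same position. Any other character distance could be substituted for `component_distance` without changing the search loop.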

Similar resources

Yencken, Lars and Timothy Baldwin (2008) Orthographic similarity search for dictionary lookup of Japanese words, In Proceedings of the 18th European Conference on Artificial Intelligence (ECAI-08), Patras, Greece

Finding an unknown Japanese word in a dictionary is a difficult and slow task when one or more of the word’s characters is unknown. For advanced learners, unknown characters evoke the form and meaning of visually similar characters they are familiar with. We propose a range of distance metrics for characters to allow learners to leverage known characters to search for words containing unknown b...

Pinyomi: Dictionary lookup via orthographic associations

Bilingual dictionaries provide meaning associations between the words of two languages, those of an ideal bilingual speaker. Learners can use these associations to look up foreign equivalents of known native words, but are forced to use script-based lookup methods when faced with unknown foreign words. This paper presents the Pinyomi Chinese-Japanese dictionary interface, which uses a novel met...

Measuring and Predicting Orthographic Associations: Modelling the Similarity of Japanese Kanji

As human beings, our mental processes for recognising linguistic symbols generate perceptual neighbourhoods around such symbols where confusion errors occur. Such neighbourhoods also provide us with conscious mental associations between symbols. This paper formalises orthographic models for similarity of Japanese kanji, and provides a proof-of-concept dictionary extension leveraging the mental a...

Linking English Words in Two Bilingual Dictionaries to Generate Another Language Pair Dictionary

In developing a machine translation system, one of the difficult tasks is how to build a transfer dictionary. It has been built by human labor from scratch in most cases. This approach, however, is very ineffective from the viewpoint of cost and time. To avoid this problem, we generate a Korean to Japanese dictionary as a sample, taking advantage of existing linguistic resources, which consist of ...

Extracting Transliteration Pairs from Comparable Corpora

Transliterating words and names from one language to another is a frequent and highly productive phenomenon. For example, the English word cache is transliterated in Japanese as キャッシュ “kyasshu”. In many cases, recent transliterations are not recorded in machine readable dictionaries so it is impossible to rely on dictionary lookup to find transliteration equivalents. In this paper we describe a meth...

Publication date: 2008